HuggingFace Daily AI Paper Digest — 2025.09.30 | SLA sparse attention cuts compute; StableToken stays noise-robust without retraining

Update: 2025-09-30

Description

This episode covers the following 15 papers:

[00:22] ⚡ SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse-Linear Attention

[01:05] 🗣 StableToken: A Noise-Robust Semantic Speech Tokenizer for Resilient SpeechLLMs

[01:54] 🎮 Multiplayer Nash Preference Optimization

[02:57] 🔗 RealUnify: Do Unified Models Truly Benefit from Unification? A Comprehensive Benchmark

[03:44] 🎨 OpenGPT-4o-Image: A Comprehensive Dataset for Advanced Image Generation and Editing

[04:28] 🧠 Beyond the Exploration-Exploitation Trade-off: A Hidden State Approach for LLM Reasoning in RLVR

[05:05] 🧩 Visual Jigsaw Post-Training Improves MLLMs

[05:37] 🎬 SANA-Video: Efficient Video Generation with Block Linear Diffusion Transformer

[06:15] 🔬 Democratizing AI Scientists Using ToolUniverse

[06:59] 🧠 When Does Reasoning Matter? A Controlled Study of Reasoning's Contribution to Model Performance

[07:31] 📊 GSM8K-V: Can Vision Language Models Solve Grade School Math Word Problems in Visual Contexts?

[08:04] 🖼 EditScore: Unlocking Online RL for Image Editing via High-Fidelity Reward Modeling

[08:54] 🚀 SparseD: Sparse Attention for Diffusion Language Models

[09:40] 🎛 EasySteer: A Unified Framework for High-Performance and Extensible LLM Steering

[10:32] 🧠 Towards Personalized Deep Research: Benchmarks and Evaluations


[Follow Us]

You can also find us on the following platforms for more content beyond the podcast:

Xiaohongshu (小红书): AI速递

